Computational Linguistics: Models, Resources, Applications

Author

  • Anna Feldman
Abstract

The main objective of Bolshakov and Gelbukh's new textbook in computational linguistics is to provide students in computer science with a foundation in the fundamentals of general linguistics necessary to develop applied software systems, and to enable students to make informed choices of proper models and data structures. Although Spanish-language applications are emphasized, the textbook contains examples from English, French, Portuguese, and Russian as well. This text is freely available in an electronic format. Supplementary materials, links to the relevant resources, and errata can be found on-line as well. The authors approach linguistic problems using two theoretical frameworks: meaning–text theory (MTT) (Mel'čuk 1974) and head-driven phrase structure grammar (HPSG) (Pollard and Sag 1994). These choices reflect primarily practical considerations. On the one hand, MTT can be used to describe any language but is particularly well-suited for the description of free-word-order languages. On the other hand, HPSG is arguably the most advanced, widely used, and user-friendly formalism in natural-language description and processing. The structure of the book is as follows. Chapter I gives an introduction to different areas of linguistics and provides an overview of the state of the art in Spanish natural language processing (NLP). Chapter II is a brief survey of the history of linguistics from Ferdinand de Saussure to Leonard Bloomfield to Noam Chomsky, discussing context-free grammars, transformational grammars, and more up-to-date approaches to grammar. Covering concepts such as valence, constraints, and unification formalism, the authors introduce Fillmore's work (Fillmore 1968), generalized phrase structure grammar (Gazdar et al. 1985), and HPSG. The last several sections are devoted to MTT.
MTT has influenced the design of new grammar formalisms such as Dependency Tree Grammars, and some aspects have been adopted into other theories (e.g., integrating lexical functions into Pustejovsky's Generative Lexicon; Wanner 1997). Chapter III provides a sketch of computational linguistics (CL) applications. The discussion illustrates that all of these applications require sophisticated linguistic knowledge. The general lesson of this chapter can be summarized thus: "Most language processing tasks can be considered as special cases of the general task of language understanding, one of the ultimate goals of CL and AI." Chapter IV describes MTT in more detail and makes interesting comparisons between MTT, HPSG, and Chomskian frameworks. Chapter V introduces the problem of language modeling in CL. The last section of the book includes various exercises, review questions, and …

Similar resources

Applications of Rhetorical Structure Theory

Rhetorical Structure Theory is a theory of text organization that has led to areas of application beyond discourse analysis and text generation, its original goals. In this paper, we review the most important applications in several areas: discourse analysis, theoretical linguistics, psycholinguistics, and computational linguistics. We also provide a list of resources useful for work within the...

Computational Linguistics in the Internet Age

Computational Linguistics (CL) has attracted more and more interest in both academic and industry communities in recent years, since it plays an essential role in many Internet applications, including search engines, online translation systems, social networks, and so forth. Almost all CL techniques, ranging from morphological, syntactic, and semantic analysis of texts, to question answering, m...

Statistical methods in language processing.

The term statistical methods here refers to a methodology that has been dominant in computational linguistics since about 1990. It is characterized by the use of stochastic models, substantial data sets, machine learning, and rigorous experimental evaluation. The shift to statistical methods in computational linguistics parallels a movement in artificial intelligence more broadly. Statistical m...

NNBlocks: A Deep Learning Framework for Computational Linguistics Neural Network Models

Lately, with the success of Deep Learning techniques in some computational linguistics tasks, many researchers want to explore new models for their linguistics applications. These models tend to be very different from what standard Neural Networks look like, limiting the possibility to use standard Neural Networks frameworks. This work presents NNBlocks, a new framework written in Python to bui...

The induction of verb frames and verb classes from corpora

Creating lexical information resources manually is an expensive effort: It takes a long time to define detailed lexical knowledge, then the information needs to be updated regularly because of neologisms, sublanguages and language change, and the lexicon will rarely if ever be complete. For these reasons and also given the increasing availability of computing power and corpus resources, one ...

Journal:
  • Computational Linguistics

Volume 32

Publication date: 2006